AITopics | vector database

Collaborating Authors

vector database

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

RAG-HAR: Retrieval Augmented Generation-based Human Activity Recognition

Sivaroopan, Nirhoshan, Karunarathna, Hansi, Madarasingha, Chamara, Jayasumana, Anura, Thilakarathna, Kanchana

arXiv.org Artificial IntelligenceDec-11-2025

Abstract--Human Activity Recognition (HAR) underpins applications in healthcare, rehabilitation, fitness tracking, and smart environments, yet existing deep learning approaches demand dataset-specific training, large labeled corpora, and significant computational resources. We introduce RAG-HAR, a training-free retrieval-augmented framework that leverages large language models (LLMs) for HAR. RAG-HAR computes lightweight statistical descriptors, retrieves semantically similar samples from a vector database, and uses this contextual evidence to make LLM based activity identification. We further enhance RAG-HAR by first applying prompt optimization and introducing an LLM-based activity descriptor that generates context-enriched vector databases for delivering accurate and highly relevant contextual information. Along with these mechanisms, RAG-HAR achieves state-of-the-art performance across six diverse HAR benchmarks. RAG-HAR moves beyond known behaviors, enabling the recognition and meaningful labelling of multiple unseen human activities. Human Activity Recognition (HAR) from wearable sensor data enables continuous monitoring, anomaly detection, and personalized interventions across healthcare [3], rehabilitation [31], fitness [28], and smart environments [14]. Despite wide-ranging applications, HAR remains challenging due to inter-subject variability, differences in sensor placement, device heterogeneity, and subtle distinctions between activities that exhibit similar motion patterns [39]. Those challenges create a strong need for accurate, generalizable, and cost-efficient solutions. Deep learning (DL) has become the dominant paradigm for HAR, with convolutional neural networks (CNNs) [6], [43], recurrent architectures [15], [17], and attention-based models [2] achieving state-of-the-art (SOT A) performance on benchmark datasets. However, DL-based HAR faces three critical limitations: (i) costly and time-consuming training procedures tailored to each dataset; (ii) performance degradation under domain shift across subjects, sensor placements, or devices; and (iii) heavy dependence on large labeled datasets [7], [35]. Despite advances in DL, these limitations leave HAR without a practical solution that is simultaneously training-free, generalizable, and scalable. To address this gap, this paper explores a fundamentally different paradigm: leveraging Large Language Models (LLMs) as reasoning engines for HAR.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2512.08984

Genre:

Research Report (1.00)
Workflow (0.68)

Industry:

Information Technology > Smart Houses & Appliances (0.68)
Health & Medicine > Consumer Health (0.66)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Toward an AI-Native Internet: Rethinking the Web Architecture for Semantic Retrieval

Bilal, Muhammad, Qazi, Zafar, Canini, Marco

arXiv.org Artificial IntelligenceNov-25-2025

The rise of Generative AI Search is fundamentally transforming how users and intelligent systems interact with the Internet. LLMs increasingly act as intermediaries between humans and web information. Yet the web remains optimized for human browsing rather than AI-driven semantic retrieval, resulting in wasted network bandwidth, lower information quality, and unnecessary complexity for developers. We introduce the concept of an AI-Native Internet, a web architecture in which servers expose semantically relevant information chunks rather than full documents, supported by a Web-native semantic resolver that allows AI applications to discover relevant information sources before retrieving fine-grained chunks. Through motivational experiments, we quantify the inefficiencies of current HTML-based retrieval, and outline architectural directions and open challenges for evolving today's document-centric web into an AI-oriented substrate that better supports semantic access to web content.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2511.18354

Genre: Research Report (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Towards Hyper-Efficient RAG Systems in VecDBs: Distributed Parallel Multi-Resolution Vector Search

Liu, Dong, Yu, Yanxuan

arXiv.org Artificial IntelligenceNov-24-2025

Retrieval-Augmented Generation (RAG) systems have become a dominant approach to augment large language models (LLMs) with external knowledge. However, existing vector database (VecDB) retrieval pipelines rely on flat or single-resolution indexing structures, which cannot adapt to the varying semantic granularity required by diverse user queries. This limitation leads to suboptimal trade-offs between retrieval speed and contextual relevance. To address this, we propose \textbf{Semantic Pyramid Indexing (SPI)}, a novel multi-resolution vector indexing framework that introduces query-adaptive resolution control for RAG in VecDBs. Unlike existing hierarchical methods that require offline tuning or separate model training, SPI constructs a semantic pyramid over document embeddings and dynamically selects the optimal resolution level per query through a lightweight classifier. This adaptive approach enables progressive retrieval from coarse-to-fine representations, significantly accelerating search while maintaining semantic coverage. We implement SPI as a plugin for both FAISS and Qdrant backends and evaluate it across multiple RAG tasks including MS MARCO, Natural Questions, and multimodal retrieval benchmarks. SPI achieves up to \textbf{5.7$\times$} retrieval speedup and \textbf{1.8$\times$} memory efficiency gain while improving end-to-end QA F1 scores by up to \textbf{2.5 points} compared to strong baselines. Our theoretical analysis provides guarantees on retrieval quality and latency bounds, while extensive ablation studies validate the contribution of each component. The framework's compatibility with existing VecDB infrastructures makes it readily deployable in production RAG systems. Code is availabe at \href{https://github.com/FastLM/SPI_VecDB}{https://github.com/FastLM/SPI\_VecDB}.

information retrieval, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2511.16681

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.91)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.67)

Add feedback

Bridging Industrial Expertise and XR with LLM-Powered Conversational Agents

Tomkou, Despina, Fatouros, George, Andreou, Andreas, Makridis, Georgios, Liarokapis, Fotis, Dardanis, Dimitrios, Kiourtis, Athanasios, Soldatos, John, Kyriazis, Dimosthenis

arXiv.org Artificial IntelligenceNov-11-2025

--This paper introduces a novel integration of Retrieval-Augmented Generation (RAG) enhanced Large Language Models (LLMs) with Extended Reality (XR) technologies to address knowledge transfer challenges in industrial environments. The proposed system embeds domain-specific industrial knowledge into XR environments through a natural language interface, enabling hands-free, context-aware expert guidance for workers. We present the architecture of the proposed system consisting of an LLM Chat Engine with dynamic tool orchestration and an XR application featuring voice-driven interaction. Performance evaluation of various chunking strategies, embedding models, and vector databases reveals that semantic chunking, balanced embedding models, and efficient vector stores deliver optimal performance for industrial knowledge retrieval. The system's potential is demonstrated through early implementation in multiple industrial use cases, including robotic assembly, smart infrastructure maintenance, and aerospace component servicing. Results indicate potential for enhancing training efficiency, remote assistance capabilities, and operational guidance in alignment with Industry 5.0's human-centric and resilient approach to industrial development.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/DCOSS-IoT65416.2025.00158

2504.05527

Country: Europe > Middle East > Cyprus (0.14)

Genre:

Instructional Material (0.93)
Research Report (0.64)

Industry: Education > Educational Setting (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

PPMI: Privacy-Preserving LLM Interaction with Socratic Chain-of-Thought Reasoning and Homomorphically Encrypted Vector Databases

Bae, Yubeen, Kim, Minchan, Lee, Jaejin, Kim, Sangbum, Kim, Jaehyung, Choi, Yejin, Mireshghallah, Niloofar

arXiv.org Artificial IntelligenceNov-4-2025

Large language models (LLMs) are increasingly used as personal agents, accessing sensitive user data such as calendars, emails, and medical records. Users currently face a trade-off: They can send private records, many of which are stored in remote databases, to powerful but untrusted LLM providers, increasing their exposure risk. Alternatively, they can run less powerful models locally on trusted devices. We bridge this gap. Our Socratic Chain-of-Thought Reasoning first sends a generic, non-private user query to a powerful, untrusted LLM, which generates a Chain-of-Thought (CoT) prompt and detailed sub-queries without accessing user data. Next, we embed these sub-queries and perform encrypted sub-second semantic search using our Homomorphically Encrypted Vector Database across one million entries of a single user's private data. This represents a realistic scale of personal documents, emails, and records accumulated over years of digital activity. Finally, we feed the CoT prompt and the decrypted records to a local language model and generate the final response. On the LoCoMo long-context QA benchmark, our hybrid framework, combining GPT-4o with a local Llama-3.2-1B model, outperforms using GPT-4o alone by up to 7.1 percentage points. This demonstrates a first step toward systems where tasks are decomposed and split between untrusted strong LLMs and weak local ones, preserving user privacy.

arxiv preprint arxiv, large language model, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2506.17336

Country: Asia (0.67)

Genre: Research Report > New Finding (0.46)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Category-Aware Semantic Caching for Heterogeneous LLM Workloads

Wang, Chen, Liu, Xunzhuo, Zhu, Yue, Youssef, Alaa, Nagpurkar, Priya, Chen, Huamin

arXiv.org Artificial IntelligenceNov-3-2025

LLM serving systems process heterogeneous query workloads where different categories exhibit different characteristics. Code queries cluster densely in embedding space while conversational queries distribute sparsely. Content staleness varies from minutes (stock data) to months (code patterns). Query repetition patterns range from power-law (code) to uniform (conversation), producing long tail cache hit rate distributions: high-repetition categories achieve 40-60% hit rates while low-repetition or volatile categories achieve 5-15% hit rates. Vector databases must exclude the long tail because remote search costs (30ms) require 15--20% hit rates to break even, leaving 20-30% of production traffic uncached. Uniform cache policies compound this problem: fixed thresholds cause false positives in dense spaces and miss valid paraphrases in sparse spaces; fixed TTLs waste memory or serve stale data. This paper presents category-aware semantic caching where similarity thresholds, TTLs, and quotas vary by query category. We present a hybrid architecture separating in-memory HNSW search from external document storage, reducing miss cost from 30ms to 2ms. This reduction makes low-hit-rate categories economically viable (break-even at 3-5% versus 15-20%), enabling cache coverage across the entire workload distribution. Adaptive load-based policies extend this framework to respond to downstream model load, dynamically adjusting thresholds and TTLs to reduce traffic to overloaded models by 9-17% in theoretical projections.

category, large language model, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2510.26835

Genre: Research Report (0.40)

Industry:

Health & Medicine (0.68)
Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.50)

Add feedback

Beyond Long Context: When Semantics Matter More than Tokens

Chawdhury, Tarun Kumar, Duke, Jon D.

arXiv.org Artificial IntelligenceOct-31-2025

Electronic Health Records (EHR) store clinical documentation as base64 encoded attachments in FHIR DocumentReference resources, which makes semantic question answering difficult. Traditional vector database methods often miss nuanced clinical relationships. The Clinical Entity Augmented Retrieval (CLEAR) method, introduced by Lopez et al. 2025, uses entity aware retrieval and achieved improved performance with an F1 score of 0.90 versus 0.86 for embedding based retrieval, while using over 70 percent fewer tokens. We developed a Clinical Notes QA Evaluation Platform to validate CLEAR against zero shot large context inference and traditional chunk based retrieval augmented generation. The platform was tested on 12 clinical notes ranging from 10,000 to 65,000 tokens representing realistic EHR content. CLEAR achieved a 58.3 percent win rate, an average semantic similarity of 0.878, and used 78 percent fewer tokens than wide context processing. The largest performance gains occurred on long notes, with a 75 percent win rate for documents exceeding 65,000 tokens. These findings confirm that entity aware retrieval improves both efficiency and accuracy in clinical natural language processing. The evaluation framework provides a reusable and transparent benchmark for assessing clinical question answering systems where semantic precision and computational efficiency are critical.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2510.25816

Genre:

Research Report > New Finding (0.88)
Research Report > Experimental Study (0.69)

Industry: Health & Medicine > Health Care Technology > Medical Record (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.91)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Rethinking and Exploring String-Based Malware Family Classification in the Era of LLMs and RAG

Chen, Yufan, Wu, Daoyuan, Zhong, Juantao, Zhang, Zicheng, Gao, Debin, Wang, Shuai, Li, Yingjiu, Liu, Ning, Chen, Jiachi, Chang, Rocky K. C.

arXiv.org Artificial IntelligenceOct-28-2025

Malware family classification aims to identify the specific family (e.g., GuLoader or BitRAT) a malware sample may belong to, in contrast to malware detection or sample classification, which only predicts a Yes/No outcome. Accurate family identification can greatly facilitate automated sample labeling and understanding on crowdsourced malware analysis platforms such as VirusTotal and MalwareBazaar, which generate vast amounts of data daily. In this paper, we explore and assess the feasibility of using traditional binary string features for family classification in the new era of large language models (LLMs) and Retrieval-Augmented Generation (RAG). Specifically, we investigate howFamily-Specific String (FSS) features can be utilized in a manner similar to RAG to facilitate family classification. To this end, we develop a curated evaluation framework covering 4,347 samples from 67 malware families, extract and analyze over 25 million strings, and conduct detailed ablation studies to assess the impact of different design choices in four major modules, with each providing a relative improvement ranging from 8.1% to 120%.

classification, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2507.04055

Country: Asia (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

AI-Enhanced Operator Assistance for UNICOS Applications

Tam, Bernard, Tournier, Jean-Charles, Rodriguez, Fernando Varela

arXiv.org Artificial IntelligenceOct-28-2025

This project explores the development of an AI-enhanced operator assistant for UNICOS, CERN's UNified Industrial Control System. While powerful, UNICOS presents a number of challenges, including the cognitive burden of decoding widgets, manual effort required for root cause analysis, and difficulties maintainers face in tracing datapoint elements (DPEs) across a complex codebase. In situations where timely responses are critical, these challenges can increase cognitive load and slow down diagnostics. To address these issues, a multi-agent system was designed and implemented. The solution is supported by a modular architecture comprising a UNICOS-side extension written in CTRL code, a Python-based multi-agent system deployed on a virtual machine, and a vector database storing both operator documentation and widget animation code. Preliminary evaluations suggest that the system is capable of decoding widgets, performing root cause analysis by leveraging live device data and documentation, and tracing DPEs across a complex codebase. Together, these capabilities reduce the manual workload of operators and maintainers, enhance situational awareness in operations, and accelerate responses to alarms and anomalies. Beyond these immediate gains, this work highlights the potential of introducing multi-modal reasoning and retrieval augmented generation (RAG) into the domain of industrial control. Ultimately, this work represents more than a proof of concept: it provides a basis for advancing intelligent operator interfaces at CERN. By combining modular design, extensibility, and practical AI integration, this project not only alleviates current operator pain points but also points toward broader opportunities for assistive AI in accelerator operations.

artificial intelligence, machine learning, widget, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.5281/zenodo.17120885

2510.21717

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Provenance of AI-Generated Images: A Vector Similarity and Blockchain-based Approach

Sharma, Jitendra, Carvalho, Arthur, Bhunia, Suman

arXiv.org Artificial IntelligenceOct-22-2025

Rapid advancement in generative AI and large language models (LLMs) has enabled the generation of highly realistic and contextually relevant digital content. LLMs such as ChatGPT with DALL-E integration and Stable Diffusion techniques can produce images that are often indistinguishable from those created by humans, which poses challenges for digital content authentication. Verifying the integrity and origin of digital data to ensure it remains unaltered and genuine is crucial to maintaining trust and legality in digital media. In this paper, we propose an embedding-based AI image detection framework that utilizes image embeddings and a vector similarity to distinguish AI-generated images from real (human-created) ones. Our methodology is built on the hypothesis that AI-generated images demonstrate closer embedding proximity to other AI-generated content, while human-created images cluster similarly within their domain. To validate this hypothesis, we developed a system that processes a diverse dataset of AI and human-generated images through five benchmark embedding models. Extensive experimentation demonstrates the robustness of our approach, and our results confirm that moderate to high perturbations minimally impact the embedding signatures, with perturbed images maintaining close similarity matches to their original versions. Our solution provides a generalizable framework for AI-generated image detection that balances accuracy with computational efficiency.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.17854

Country: North America > United States > Ohio (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.88)

Add feedback